ABSTRACT

PROBLEM TO BE SOLVED: To select the retrieving method with the highest
retrieval efficiency out of plural retrieving methods corresponding to
retrieval conditions by providing a specified retrieving method selecting
means.

SOLUTION: A retrieving method selecting means 1 selects one retrieving
method out of plural retrieving methods corresponding to the retrieval
conditions including a retrieval pattern, namely, the number of keywords
as character strings. In this case, when the length of the shortest
keyword is short, an AC method is adopted, for example, and when the
length of the shortest keyword is comparatively long and there is only
one keyword, a BM method or a Sunday method is selected, for example.
Further, when the length of the shortest keyword is comparatively long and
there are plural keywords, the value of a discrimination function
depending on the number and length of these keywords is found and the AC
method or a FAST method is used corresponding to that value. Thus, the
retrieving method considered optimum is automatically selected
corresponding to the retrieval conditions such as the number of keywords
and the length of the shortest keyword.
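
The selection rule in this abstract can be sketched as a small decision
function. This is only an illustration under stated assumptions: the
abstract gives neither the length cutoff nor the discrimination function,
so SHORT_LEN, default_discriminant, the threshold, and the side of the
threshold that maps to the AC method are all hypothetical.

```python
from typing import Callable, Sequence

SHORT_LEN = 3  # assumed cutoff at or below which the shortest keyword is "short"


def default_discriminant(keywords: Sequence[str]) -> float:
    """Hypothetical discrimination function of keyword count and length."""
    return len(keywords) / min(len(k) for k in keywords)


def select_method(
    keywords: Sequence[str],
    discriminant: Callable[[Sequence[str]], float] = default_discriminant,
    threshold: float = 1.0,  # assumed decision boundary
) -> str:
    """Return the name of the retrieving method for the given keywords."""
    if not keywords or any(not k for k in keywords):
        raise ValueError("at least one non-empty keyword is required")
    shortest = min(len(k) for k in keywords)
    if shortest <= SHORT_LEN:
        return "AC"  # short shortest keyword
    if len(keywords) == 1:
        return "BM"  # the abstract also allows the Sunday method here
    # Plural, comparatively long keywords: decide by the function value.
    return "AC" if discriminant(keywords) >= threshold else "FAST"
```

A caller would pass the keyword list once per query and dispatch to the
matching search routine by the returned name.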

Operating a digital computer, involves providing data type and size for
converting text values to data type corresponding to the text keywords in 
template file for each text keyword and text value pair 

Patent Assignee: INT BUSINESS MACHINES CORP (IBMC ) 

Inventor: DIEDRICH R A; EVANS S T; FINKENAUR J K 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No     Kind  Date      Applicat No   Kind  Date      Week
US 6173288    B1    20010109  US 96654989   A     19960529  200134 B
                              US 9885630    A     19980527

Priority Applications (No Type Date): US 96654989 A 19960529; US 9885630 A
19980527
Patent Details: 

Patent No     Kind  Lan  Pg  Main IPC      Filing Notes
US 6173288    B1         8   G06F-017/21   Div ex application US 96654989
                                           Div ex patent US 5787450

Abstract (Basic) : US 6173288 B1

NOVELTY - An input string containing pairs of text keywords and
text values is received. The data type and size for converting these
text values to the data type corresponding to the text keywords are
provided in a template file for each text keyword and text value
pair. A data structure of converted values without keywords is
then built based on the resulting template.
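
As an illustration of the conversion described above, the following
sketch parses a CGI-style input string and builds a list of typed values
in template order. The template layout, the type names and the sample
field names are assumptions made for the example; the abstract does not
give the template file format.

```python
from urllib.parse import parse_qsl

# Hypothetical template: one (keyword, type, size) entry per expected pair.
TEMPLATE = [
    ("name", "char", 8),
    ("age", "int", 4),
    ("score", "float", 8),
]

CONVERTERS = {"char": str, "int": int, "float": float}


def build_structure(input_string: str, template=TEMPLATE) -> list:
    """Convert keyword=value text pairs into typed values, keywords dropped."""
    pairs = dict(parse_qsl(input_string))
    values = []
    for keyword, type_name, size in template:
        value = CONVERTERS[type_name](pairs[keyword])
        if type_name == "char":
            value = value[:size]  # truncate text to the declared size
        values.append(value)
    return values
```

The order of the pairs in the input string does not matter, because the
template alone fixes the position of each converted value. A keyword
missing from the input raises KeyError in this sketch.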

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the 
following: 

(a) a digital computer; 

(b) a memory element for storing digital signals operable to 
control a digital computer 

USE - Operating a digital computer. 

ADVANTAGE - Returned data can be referenced by structure member 
name, and the data is returned in useable types. Provides method and 
programming structure for creating a data structure comprising a 
nonlinear data object with typed data fields and field names from 
a common gateway interface type input string. 

DESCRIPTION OF DRAWING (S) - The figure shows the flow diagram of 
the digital computer operating method. 

pp; 8 DwgNo 4/4 

Text file searching system in internet - has keyword filter to
selectively delete attribute name index when its repetition is 
detected 

Number of Countries: 001 Number of Patents: 002

Abstract (Basic) : JP 11306205 A

NOVELTY - A file search demand in natural language expression is
investigated to acquire an attribute name index and an attribute value,
using which a keyword for searching is extracted. A filter (5)
selectively deletes the attribute name index when its repetition is
detected. The attribute value and name index from the filter are then
used to search the required file.

DETAILED DESCRIPTION - An INDEPENDENT CLAIM is also included for a
recording medium storing a text file searching program.

USE - For searching text file such as XML in internet by natural 
language expression search inquiry. 

ADVANTAGE - Redundancy of reply corresponding to search demand is 
eliminated by keyword filter. User desired file can be retrieved 
easily, by natural language expression demand. DESCRIPTION OF 
DRAWING (S) - The figure shows the block diagram of text file searching 
system. (5) Keyword filter. 

Desired database selection apparatus of database system - computes
expected value which satisfies search key word required to select 
desired database from extracted field name 

Abstract (Basic) : JP 11238080 A

NOVELTY - A field name corresponding to a word in a given
search key word is extracted from registered field names. A
calculation circuit (17) computes an expected value which satisfies a
search key word required to select a desired database, from the extracted
field name.

USE - For choosing the database of the user's desire from several
databases.

ADVANTAGE - Performs reliable search of a database containing records
which satisfy user demand even when the given search key word is not
registered.

DESCRIPTION OF DRAWING (S) - The figure is a functional block
diagram showing internal components of the database selection apparatus.
(17) Calculation circuit.

Presentation support system for creating images for display with
presentation materials - forms presentation materials based upon various
controlled data and adds background data e.g. images or sounds to
presentation materials

Abstract (Basic) : US 5459829 A

The presentation support system forms presentation materials based
on various collected data and adds background data, e.g. images or
sounds, to the presentation materials. On the basis of, e.g., input items,
attribute values, and titles, their categories are analyzed by using a
proper noun dictionary, a concept dictionary, and a numeric attribute
name dictionary. On the basis of the analyzed categories, changes in
attribute values are analyzed in accordance with inference rules, and
a keyword for describing a background state of presentation is
extracted.

A background material suitable for presentation is selected from 
background materials such as images and sounds in accordance with the 
extracted keyword. The selected background material is displayed in 
combination with a graph which is formed in accordance with content 
data about an object to be presented. 

USE/ADVANTAGE - Provides structure generating appts which can 
easily generate various types of graphic structures, and can represent 
action patterns of moving objects which are entirely or locally 
different from each other using small amount of data without describing 
patterns in different programs. 

ABSTRACT

PROBLEM TO BE SOLVED: To provide a document retrieving device which needs
only a low cost for retrieval and can reduce omission of retrieval due to a
character recognition error.

SOLUTION: This document retrieving device 451 is provided with a 1st
deciding means 401 that decides whether or not at least a part of a
keyword coincides with at least a part of recognition results by
comparing character codes, a 1st noncoincident character specifying means
402 which specifies a 1st character that does not coincide with the
recognition results in at least one 1st character included in the
keyword as a 1st noncoincident character when a part of the keyword
coincides with at least a part of the recognition results, a 2nd
noncoincident character specifying means 402 which specifies one or more
continuous 2nd characters having a width closest to the width of the 1st
noncoincident character as a 2nd noncoincident character in at least one
2nd character included in the recognition results, and a 2nd deciding
means 402 that decides whether or not the 1st noncoincident character
coincides with the 2nd noncoincident character by comparing the
characteristic quantity of the image of the 1st noncoincident character
with the characteristic quantity of the image of an area including one or
more partial areas allocated to the one or more continuous 2nd characters
included in the 2nd noncoincident character.

ABSTRACT

PROBLEM TO BE SOLVED: To automatically generate index information for a
document that is actually prepared, edited or browsed by a user by
extracting attribute information about the document according to set
timing, generating the index information and storing it in correspondence
with the document.

SOLUTION: Timing when index information for retrieving stored documents is
prepared is set, attribute information about a document is extracted
according to the set timing, and the index information is made to
correspond to the document and stored. In this device, a registration
operation setting part 2 sets the operation of a browsing part 1 which
becomes the timing when the generation of index information for a document
prepared, edited and browsed by the part 1 is started. An index information
generation part 7 generates index information for the document browsed or
generated and edited by the browsing part 1 on the basis of a keyword and
various attribute values obtained by a keyword extracting part 5 and
an attribute acquisition part 6.

DOCUMENT FILE RETRIEVAL DEVICE AND MACHINE READABLE RECORDING MEDIUM
RECORDING PROGRAM 

PUB. NO.: 11-306205 [ JP 11306205 A] 

PUBLISHED: November 05, 1999 (19991105) 

ABSTRACT

PROBLEM TO BE SOLVED: To realize a retrieval inquiry about a WWW home page
by a natural language.

SOLUTION: A WWW home page being a retrieval object document file is
described in XML. When a retrieval condition composition is inputted, a
keyword extraction part 4 converts a natural language expression
expressing an attribute name into an attribute name index including
the attribute name and also converts the natural language expression
expressing the attribute value into an attribute value index including a
pair of the said attribute name and attribute value. A keyword
filter part 5 deletes the attribute name index existing at a place
where the attribute name and the attribute value of the same
attribute exist adjacent to each other in a converted index string. A
document contents check part 6 checks whether or not a tag corresponding to
the pairs of attribute name and value of all the attribute value
indexes exists in the retrieval object document file. If the said tag
exists, a document contents output part 9 retrieves and outputs the
attribute value of the tag having the relevant attribute name of the
attribute name index that is included in the converted index string.
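
The behaviour of the keyword filter part 5 can be sketched as follows.
The tuple representation of the index string is an assumption for the
example; the abstract does not say how indexes are encoded.

```python
from typing import List, Tuple

# Assumed encoding: ("name", attr) or ("value", attr, value).
Index = Tuple[str, ...]


def filter_indexes(indexes: List[Index]) -> List[Index]:
    """Drop a name index that sits next to a value index of the same attribute."""
    kept = []
    for i, index in enumerate(indexes):
        if index[0] == "name":
            neighbours = indexes[max(i - 1, 0):i] + indexes[i + 1:i + 2]
            if any(n[0] == "value" and n[1] == index[1] for n in neighbours):
                continue  # redundant: the adjacent value index already names it
        kept.append(index)
    return kept
```

For a query such as "the title of the item whose price is 100", a name
index for "price" next to the value index ("price", "100") is removed,
while the name index for "title" survives and tells the output stage
which attribute value to return.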

DEVICE AND METHOD FOR DATA RETRIEVAL AND STORAGE MEDIUM STORING DATA

ABSTRACT



PROBLEM TO BE SOLVED: To smoothly retrieve the temporary data which are not 
desired to be registered into a data base by preparing a 2nd storage means 
that stores a 2nd data base containing the attribute information on the 
temporarily registered data in addition to a 1st data base that is stored 
in a 1st storage means. 

SOLUTION: When the retrieval instruction given from an input device 
designates also the temporary data as a retrieval object, a CPU 101 selects 
the temporary data to be retrieved, e.g. an image file that is stored in an 
external storage 107 such as a CD-ROM, etc., and reads the image data out 
of the selected image file. Then the CPU 101 extracts and generates the 
image feature value and keywords, i.e., the attribute information,
from the image data stored in the image file of the storage 107 to register 
them in a memory in the same method as that which is applied when the 
attribute information is registered in a normal data base. In such cases, 
a new data base is prepared in a RAM 106 to store the temporary data and 
the attribute information and their related image data are registered in 
this data base. 

THESAURUS DICTIONARY SYSTEM HAVING NUMERIC VALUE DECISION FUNCTION

ABSTRACT

PROBLEM TO BE SOLVED: To convert a quantitative key word into a qualitative 
key word. 

SOLUTION: A set of key words that are mutually synonymous forms one record,
and many records R1, R2, ... are prepared as a thesaurus dictionary. Each
record holds a representative key word and a synonymous key word as
qualitative key words and a conditional expression key word as a
quantitative key word. When the character string 'maximum blood
pressure = 180' is given, a conditional expression key word that
includes the attribute part 'maximum blood pressure' is sought
through retrieval. The conditional expression 'maximum blood pressure
> 150' is retrieved from the record R1, and the conditional expression
'maximum blood pressure < 100' is retrieved from the record R2. When
the numeric value '180' is substituted into each conditional expression
and the representative key word 'high blood pressure' of the record R1,
whose condition is satisfied, is outputted, the quantitative key word
'maximum blood pressure = 180' is converted into the qualitative key word
'high blood pressure'.
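
The conversion can be illustrated with a minimal sketch. The record
layout is an assumption, synonymous key words are omitted, and the
representative key word 'low blood pressure' for record R2 is invented
for the example because the abstract does not name it.

```python
import operator
import re

OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}

# Records R1 and R2 modelled on the abstract's example.
THESAURUS = [
    {"representative": "high blood pressure",
     "condition": ("maximum blood pressure", ">", 150)},
    {"representative": "low blood pressure",  # assumed name for R2
     "condition": ("maximum blood pressure", "<", 100)},
]


def to_qualitative(quantitative: str, thesaurus=THESAURUS):
    """Map 'attribute = number' to the representative key word, if any."""
    match = re.fullmatch(r"\s*(.+?)\s*=\s*(-?\d+(?:\.\d+)?)\s*", quantitative)
    if match is None:
        return None
    attribute, value = match.group(1), float(match.group(2))
    for record in thesaurus:
        name, op, bound = record["condition"]
        # Substitute the numeric value into the conditional expression.
        if name == attribute and OPS[op](value, bound):
            return record["representative"]
    return None
```

A value that satisfies no record's condition, such as 120 here, yields
None, so the caller can fall back to the quantitative key word itself.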

ABSTRACT

PROBLEM TO BE SOLVED: To reduce the capacity of document indexes and to
improve the efficiency of retrieval corresponding to the request of inquiry
with document retrieval conditions by uniquely recognizing stored documents
and using a 'document number' smaller in size than a line identifier for
document indexing.

SOLUTION: On receiving a retrieval request 1 from the source of inquiry
with a conditional expression and a keyword corresponding to the
attribute value of data, a document index 142 prepared corresponding to
a document 145 is referred to based on the keyword, and the document
number of a document object containing the keyword is obtained. A record
identifier 51 of the entry of a conversion table 141 corresponding to the
document number is obtained. The document object containing the keyword of
the retrieval request 1 is related through data 144 to the line of that
conversion table 141. Using an index 143 prepared corresponding to
the attribute value of data contained in the conditional expression of
the retrieval request 1, the record ID of the line coincident with the
conditional expression is obtained, and the set of record IDs is
narrowed down using the record ID provided from the index 143.
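
The narrowing step can be sketched with in-memory stand-ins for the
three structures. The dictionaries, their contents, and the restriction
to an equality condition are assumptions for the example; only the
reference numerals come from the abstract.

```python
# keyword -> document numbers (stand-in for document index 142)
document_index = {"retrieval": {1, 2}, "index": {2, 3}}
# document number -> record ID (stand-in for conversion table 141)
conversion_table = {1: 101, 2: 102, 3: 103}
# (attribute, value) -> record IDs (stand-in for index 143)
attribute_index = {("year", 1998): {102, 103}, ("year", 1997): {101}}


def search(keyword, attribute, value):
    """Return record IDs matching both the keyword and the condition."""
    doc_numbers = document_index.get(keyword, set())
    keyword_records = {conversion_table[n] for n in doc_numbers}
    condition_records = attribute_index.get((attribute, value), set())
    # Narrow the keyword hits down with the hits from the attribute index.
    return sorted(keyword_records & condition_records)
```

Because the document index stores small document numbers rather than
full record identifiers, the conversion table is consulted only for the
documents that actually contain the keyword.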

ABSTRACT



PURPOSE: To shorten constitution time and facilitate the operation by 
searching the inside of an existent document file on the basis of 
prescribed key information, and duplicating and analyzing necessary 
information and automatically extracting it. 

CONSTITUTION: A document file searching means 3 searches the document file
selected by a document file selecting means 2 on the basis of a key word,
etc., for a date, a department, etc. A document file information
duplicating means 4 duplicates document file information such as
explanation, a paragraph, and a document of a document file that a base
point position belongs to for data base constitution. Here, the base point
indicates a position in the document file found with the specified key
word. An attribute value calculating means 5 analyzes the document
file information duplicated for data base constitution by the document file
information duplicating means 4 and calculates necessary attribute
values, etc. A data base constituting means 6 constitutes a data base 8
on the basis of the data generated by the document file information
duplicating means 4 and the attribute values, etc., found by the
attribute value calculating means, and a data base correcting means 7
corrects it.



31/5/10 (Item 10 from file: 347) 

DIALOG (R) File 347:JAPIO 

(c) 2004 JPO & JAPIO. All rts. reserv. 

03954670 **Image available** 
ELECTRONIC FILING DEVICE 



PUB. NO.:      04-319770 [JP 4319770 A]
PUBLISHED:     November 10, 1992 (19921110)
INVENTOR(s):   UCHIYAMA TORU
APPLICANT(s):  FUJI XEROX CO LTD [359761] (A Japanese Company or
               Corporation), JP (Japan)
APPL. NO.:     03-086855 [JP 9186855]
FILED:         April 18, 1991 (19910418)
INTL CLASS:    [5] G06F-015/40
JAPIO CLASS:   45.4 (INFORMATION PROCESSING -- Computer Applications)
JOURNAL:       Section: P, Section No. 1509, Vol. 17, No. 149, Pg. 125,
               March 24, 1993 (19930324)



ABSTRACT 

PURPOSE: To efficiently retrieve a file by a keyword.

CONSTITUTION: A keyword file (KF) is registered in a specified position of
file tree structure by a KF registering means 2 and stored in a file
storing means 1. When the storage of a KF-added file is instructed, a file
storing means 3 validates the KF only when a specified file storing
position is in hierarchy lower than the registered position, extracts the
attribute of the keyword and its value from the KF, applies the
extracted results to the file, and stores the extracted results in the
specified position of the tree structure in the means 1. When a retrieving
range indicating which positional range in the tree structure is to be
retrieved is specified, a file retrieving means 4 refers to the position in
the KF tree structure registered in the means 2, retrieves KF names
included in the specified retrieving range and displays the retrieved
results on a display means 6 through a display control means 5.
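
The validity check performed when a KF-added file is stored can be
sketched with path objects. Representing tree positions as POSIX-style
paths, and the function and field names, are assumptions made for the
example.

```python
from pathlib import PurePosixPath


def keyword_file_applies(kf_position: str, storing_position: str) -> bool:
    """True only if the storing position lies strictly below the KF position."""
    kf = PurePosixPath(kf_position)
    target = PurePosixPath(storing_position)
    return kf in target.parents


def store_file(tree: dict, name: str, storing_position: str,
               kf_position: str, kf_attributes: dict) -> dict:
    """Store a file entry, attaching KF attributes only when the KF is valid."""
    valid = keyword_file_applies(kf_position, storing_position)
    entry = {"name": name, "attributes": dict(kf_attributes) if valid else {}}
    tree.setdefault(storing_position, []).append(entry)
    return entry
```

In this sketch a file stored outside the subtree of the registered
position is still stored, but without the keyword attributes.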

ABSTRACT

PURPOSE: To attain character processing tasks such as the recognition of
character codes and the extraction of key words at a high speed by
performing the comparison of character codes at a high speed and making
all character codes uniformly two bytes long.

CONSTITUTION: An input register 1 is provided together with a 1-byte
register 2, a character code detector 3, a 1st register 4, a 1st AND
gate 5, a 2nd AND gate 6, an SR flip-flop 7, a selector 8, an output
register 9, a 2nd register 10, a JK flip-flop 11, and a 4-input OR gate 12.
A special code is previously stored to show the start/end of insertion of a
2-byte code character, and the detector 3 detects at a high speed the input
of the special code. Thus the special code is excluded at a high speed from
the character codes and all character codes are made uniformly two
bytes long. As a result, the character strings can be processed at a high
speed.

ABSTRACT

PURPOSE: To omit the index of key words and to avoid the increase of the
file capacity by handling integratedly the key words and the
attributes other than these key words.

CONSTITUTION: When data are transferred to a data value memory part 11
from outside, the key words contained in the data are coded by a key word
coding part 12. These coded key words are combined at a key word
attribute value calculating part 13 for acquisition of the same form as
other attributes and stored in a disk device 15 via a data storing part
14. Therefore the key word attribute value can be used when the
conditions related to the key words are designated and data are extracted.
Thus the index of key words is not required and the necessary disk capacity
is reduced.

Method of finding irregular phrases provides a highly efficient and
user-friendly search tool for finding irregular phrases with the same 
attributes 

Patent Assignee: INVENTEC CORP (INVE-N) 

Inventor: WANG D 

Number of Countries: 001 Number of Patents: 001 
Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

TW 377416 A 19991221 TW 97119386 A 19971219 200044 B 

Priority Applications (No Type Date) : TW 97119386 A 19971219 
Patent Details : 

Patent No Kind Lan Pg Main IPC Filing Notes 
TW 377416 A 17 G06F-017/30 

Abstract (Basic) : TW 377416 A 

This invention relates to a highly efficient and user-friendly
search tool for finding irregular phrases with the same attributes.
The user inputs a phrase or character string on the computer screen for a
keyword search based on the search rules for keyword attributes.
Keywords in the database sharing some common attributes are encoded,
and a plurality of index tables is created on the basis of the
relation of these attributes. A rapid search is accomplished through
index tables by means of a reverse-exclusion algorithm, which carries
out a repetitive process of rejecting those keywords in the keyword
database that do not match the special attributes, until a list of
relatively few keywords having the same attributes is obtained. It
is then possible to perform a detailed comparison among these selected
keywords to find the desired keyword.
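
A minimal sketch of the reverse-exclusion idea follows. The index table
layout, the sample attributes ("length", "first") and the stopping
parameter small_enough are assumptions; the abstract does not define the
attributes or the encoding it uses.

```python
from typing import Dict, Set

# Hypothetical index tables: attribute -> attribute value -> keywords.
INDEX_TABLES: Dict[str, Dict[object, Set[str]]] = {
    "length": {3: {"cat", "cot", "dog"}, 4: {"coat"}},
    "first": {"c": {"cat", "cot", "coat"}, "d": {"dog"}},
}


def reverse_exclusion(all_keywords, query_attributes,
                      index_tables=INDEX_TABLES, small_enough=2):
    """Repeatedly reject keywords that lack one of the query attributes."""
    candidates = set(all_keywords)
    for attribute, value in query_attributes.items():
        matching = index_tables[attribute].get(value, set())
        rejected = candidates - matching
        candidates -= rejected
        if len(candidates) <= small_enough:
            break  # few enough left for a detailed comparison
    return candidates
```

The returned shortlist is what a detailed comparison would then examine
to pick the desired keyword.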

Communication service procedure involves calculating degree of
relationship between specific user and other users based on 
characteristic value of corresponding keywords registered beforehand 
by respective users 

Patent Assignee: NIPPON TELEGRAPH & TELEPHONE CORP (NITE ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 2000132509 A 20000512 JP 98306017 A 19981027 200034 B 

Priority Applications (No Type Date) : JP 98306017 A 19981027 
Patent Details : 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 2000132509 A 15 G06F-015/00 

Abstract (Basic) : JP 2000132509 A 

NOVELTY - The degree of relationship between a user and other users
currently involved in network communication is calculated based on
the characteristic value corresponding to a keyword previously
registered by the user. The relevant user identifications are displayed
on the screen in response to the calculated degree of relationship.
Information on a particular user is displayed by selecting a suitable
icon on the screen.

DETAILED DESCRIPTION - The keyword registered beforehand by the
user is acquired as the user's individual characteristic value.
The keyword is assigned a priority level during processing.
INDEPENDENT CLAIMS are also included for the following:

(a) communication service system;

(b) program for communication service procedure

USE - For communicating with an individual or arbitrary companions
belonging to a specified group or unspecified group for community
creation assistance on the internet.

ADVANTAGE - The existence of each user on a network and their
correlation are recognized and each individual's information can be
referred to.

DESCRIPTION OF DRAWING (S) - The figure explains the steps involved
in the communication service procedure.

Patent No Kind Lan Pg Main IPC Filing Notes

Abstract (Basic) : JP 2000089991 A

NOVELTY - A controller (4) in the server (1) transfers information from
the text file (54), document information management database (51) and
keyword index file (52) depending on the process demand, which includes
document registration and search. A controller (6) in the client (2)
transfers information with the controller (4) and the registration file
(62) which specifies the registration document.

DETAILED DESCRIPTION - A document entity to identify a document is
stored in the text file and pointers to refer to these entities are
defined based on folder hierarchy. The attribute value of each
document is stored in the document information management database. A
keyword index for high speed search ability is stored in the keyword
index file. The document search is performed by the controller in the
server with the multimedia index file (53).

USE - In multimedia to register and search still picture images.

ADVANTAGE - Multi-hierarchization of folders is enabled and is
displayed automatically by the keyword included in the attribute value
of documents.

DESCRIPTION OF DRAWING (S) - The figure shows the document control 
system. 

Server (1) 
Client (2) 
Controllers (4,6) 

Document information management database (51) 

Keyword index file (52) 

Multimedia index file (53) 

Text file (54) 

Registration file (62) 

Electronic dictionary search apparatus for electronic notebook, has
indicator which indicates to perform stoppage of scroll up and down 
operation of screen, and decides index word, information on stopped 
screen 

Patent Assignee: SHARP KK (SHAF ) 

Number of Countries: 001 Number of Patents: 002 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

Abstract (Basic) : JP 2000067059 A

NOVELTY - Search unit displays and searches dictionary screen for
searching key word list screen and dictionary information for
searching index search screen to search index word. Indicators indicate
scroll up-down operation of each screen among a search unit. Another
indicator indicates stoppage of scroll up-down operation of screen, and
decides index word, keyword or dictionary information on stopped
screen.

DETAILED DESCRIPTION - The search apparatus has a dictionary
table (5a) to store dictionary information about a keyword, and the
initial of the keyword is stored as index. A lead character
followed by two or more characters is stored as index word in the index
table. An INDEPENDENT CLAIM is also included for a memory medium.

USE - For portable terminal, electronic notebook.

ADVANTAGE - The key word need not be input by the operator, since
the search apparatus has a dictionary table from which dictionary
information can be found using a search buffer.

DESCRIPTION OF DRAWING (S) - The figure shows the block diagram of the
electronic dictionary search apparatus. (5a) Dictionary table.

Object comparison optimization method for visual information retrieval
systems in environmental imaging, medicine, multimedia and digital image 
management 

Patent Assignee: VIRAGE INC (VIRA-N) 

Abstract (Basic) : US 5913205 A

NOVELTY - A cost is assigned for executing the comparison process for
each visual primitive of any one of the feature vectors. Then the
comparison process is applied between the selected visual primitives of
any two feature vectors.

DETAILED DESCRIPTION - The visual primitives which are ordered from
minimum to maximum cost are further ordered by a set of primitive
weights. The cost is set by assigning a predetermined computation cost
to the selected primitive and a computation cost for each target
primitive set relative to the selected primitive cost. The target
primitives are the non-selected primitives in a predefined schema of
primitives. The visual primitives and the associated costs are
registered with a search engine. An INDEPENDENT CLAIM is also included
for an object comparison optimization apparatus.

USE - For visual information retrieval systems in environmental
imaging, medicine, multimedia and digital image management.

ADVANTAGE - The content extraction results in very high information
compression, as an image file's contents may be expressed in as little as
several hundred bytes of memory, regardless of the original image
size. High level problems such as automatic, unsupervised keyword
assignment or image classification can be addressed using the
infrastructure provided by the visual information retrieval (VIR)
engine.

DESCRIPTION OF DRAWING (S) - The drawing shows the block diagram of
the modules of a visual information retrieval (VIR) system.

Thesaurus dictionary system for database search using keywords - searches
conditions corresponding to input keyword showing specific numerical 
value with predetermined attribute based on which, synonym of that 
keyword is detected as another keyword 

Patent Assignee: DAINIPPON PRINTING CO LTD (NIPQ ) 

Number of Countries: 001 Number of Patents: 001 

Patent Family: 

Patent No Kind Date Applicat No Kind Date Week 

JP 10222517 A 19980821 JP 9734469 A 19970203 199844 B 

Priority Applications (No Type Date) : JP 9734469 A 19970203 
Patent Details: 

Patent No Kind Lan Pg Main IPC Filing Notes 
JP 10222517 A 9 G06F-017/30 

Abstract (Basic) : JP 10222517 A 

The system uses keywords that are put together in groups according 
to connections between their meanings. The numerical value with 
predetermined attribute , based on which keywords are grouped, is 
recorded. 

When a keyword showing a specific numerical value is input, the 
corresponding conditions are searched. The synonym of the input keyword 
matching the searched condition is detected as another keyword for the 
same meaning of the input keyword. 

ADVANTAGE - Converts quantitative keyword to qualitative keyword. 

Computer based data structure generation method - involves converting
value of text to type of data of matching field and writing converted 
value into output buffer provided at specific location in data structure 
with respect to matching field 

Abstract (Basic) : US 5787450 A

The method involves generating an ordered list of fields defined
for each keyword, type and size of data, in a template file. An input
string received from a calling program is used to read a keyword
pair in the file. A field name matching the read keyword pair
is searched in the template file.

The type of data is determined for the matching field name from the
template file. The value of text is converted to the type of data of the
matching field. The converted value is written in an output buffer
which is at a predetermined location in the data structure
corresponding to the matching field.
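
Writing a converted value at a predetermined location of an output
buffer can be sketched with Python's struct module. The template
contents, the field names and the byte offsets are assumptions for the
example; the abstract does not specify the buffer layout.

```python
import struct

# Hypothetical template: field name -> (struct format, byte offset).
TEMPLATE = {
    "name": ("8s", 0),    # 8-byte text field
    "age": ("<i", 8),     # 4-byte little-endian integer
    "score": ("<d", 12),  # 8-byte little-endian float
}
BUFFER_SIZE = 20


def write_field(buffer: bytearray, keyword: str, text_value: str) -> None:
    """Convert text_value to the field's type and write it at its offset."""
    fmt, offset = TEMPLATE[keyword]
    if fmt.endswith("s"):
        value = text_value.encode("ascii")
    elif fmt.endswith("i"):
        value = int(text_value)
    else:
        value = float(text_value)
    struct.pack_into(fmt, buffer, offset, value)
```

Each keyword pair read from the input string would be passed to
write_field in turn, so the finished buffer holds only typed values at
fixed positions and no keywords.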

ADVANTAGE - Uses several memory elements which store digital
signals used for operating the computer.

Converting AS/400 display files from character based display format to
GUI display format - analyses and converts each of screen fields to
related GUI window component(s) and converting attributes of field
into GUI component properties

Abstract (Basic) : RD 410081 A

The compiled character based screen specifications are found in a
binary object known as an AS/400 display file. These specifications are
taken as input by the conversion program.

Each of the screen fields is analysed and converted to the related GUI
window component(s). The attributes of a field are converted into
GUI component properties. Push buttons are generated for all function
keys encoded. For example, if a display file has an input/output entry
field that is defined as being twenty characters long, accepting
character data only, and using the "keyboard protect" attribute,
DSPATR(PR), then that field is converted into a GUI component of type
"Entry field" which accepts up to twenty characters of data, and has
a GUI property of "Read Only".

Alternatively, if a field uses the keyword "VALUE" which
assigns a limited list of valid values which a user may type into a
field, then that field is converted into a "Combination box" GUI
component, with the limited list of values built into it.
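
The two conversion rules given as examples can be sketched as a mapping
function. The dictionary keys used to describe a screen field and a GUI
component ('length', 'attributes', 'values', 'max_length', 'read_only')
are hypothetical; the disclosure does not give the program's data model.

```python
def convert_field(field: dict) -> dict:
    """Map one parsed display-file field to a GUI component description."""
    if field.get("values"):
        # A limited list of valid values becomes a combination box.
        return {"type": "Combination box", "values": list(field["values"])}
    component = {"type": "Entry field", "max_length": field["length"]}
    if "DSPATR(PR)" in field.get("attributes", ()):
        component["read_only"] = True  # keyboard protect -> Read Only
    return component
```

A full converter would apply such a function to every field of the
display file and also emit a push button for each encoded function key.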

The output of the program is a GUI description of the entire screen 
in a text file shown below. 
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Authoring system for multi -media information including sound information 
- has recording unit, index generator, search input key for desired image 
or sound, and results output unit 
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Abstract (Basic) : WO 9709683 A 

The authoring system has at least a retrieval key-inputting device
through which a retrieval key such as a key word or an attribute
value is input, a retrieval result outputting device which outputs
the retrieved sound information or moving picture, a multimedia
information retrieving device which retrieves multimedia information
including sound information and moving picture information, and an index
generating device which generates indexes representing the
correspondences between the sound information and the moving picture
information with respect to multimedia information including sound
information.

A desired moving picture or sound information can be readily 
retrieved from other corresponding information. 

ADVANTAGE - Retrieval of moving picture or sound information from
video information including sound information is facilitated using a
portable information terminal such as a PDA (Personal Digital Assistant)
or notebook computer, or using a multimedia terminal such as a personal
computer or workstation.

Dwg.1/11 

Title Terms: SYSTEM; MULTI; MEDIUM; INFORMATION; SOUND; INFORMATION; RECORD 
; UNIT; INDEX; GENERATOR; SEARCH; INPUT; KEY; IMAGE; SOUND; RESULT; 
OUTPUT; UNIT 

Derwent Class: T01; W04 

International Patent Class (Main) : G06F-017/30 
File Segment: EPI 



31/5/24 (Item 12 from file: 350) 

DIALOG (R) File 350: Derwent WPIX 

(c) 2004 Thomson Derwent. All rts. reserv. 

008585265 **Image available** 

WPI Acc No: 1991-089297/199113 

XRPX Acc No: N91-069031 

Voice information service system - enables inputting of desired specific 
character by selecting two buttons from touch-tone type telephone and 
pressing them in given sequence 

Patent Assignee: KOREA TELECOM AUTH (KOTE-N) ; KOREA TELECOM AUTHORITY 
(KOTE-N); KANKOKU DENKI TSUSH (KANK-N) ; KOREA TELECOM CORP (KOTE-N) 

Inventor: LN D J; LN K E; RAK L J; EUNG IN K; JAE IN K; JONG RAK L; KIM J;
LEE J; KIM E; KIM E I; KIM J I; LEE J R
Number of Countries: 004 Number of Patents: 007
Patent Family:
Patent No     Kind  Date       Applicat No   Kind  Date       Week
GB 2236232    A     1991032?   GB 9017560    A     19900810   199113 B
JP 4002254    A     19920107   JP 90o411oU   A     199011 3U  199211
US 5163084    A     19921110   US 90563481   A     19900807   199248
KR 9205581    B     19920709   KR 8911435    A     19890811   199309
KR 9300593    B     19930125   KR 903367     A     19900314   199341
US 5255310    A     19931019   US 90563481   A     19900807   199343
                               US 92931135   A     19920817
GB 2236232    B     19940302   GB 9017560    A     19900810   199407

Priority Applications (No Type Date): KR 903367 A 19900514; KR 8911435 A
  19890811
Patent Details:
Patent No     Kind  Lan Pg  Main IPC      Filing Notes
JP 4002254    A         10
US 5163084    A         12  H04M-001/64
US 5255310    A         10  H04M-001/64   Cont of application US 90563481
                                          Cont of patent US 5163084
GB 2236232    B          3  H04M-003/42
KR 9205581    B             H04M-001/23
KR 9300593    B             G06F-015/40

Abstract (Basic) : GB 2236232 A 

In the system a telephone (1) is provided to transmit a Dual Tone
Multi-Frequency (DTMF) signal by utilising a character panel including
several buttons (30) which are arranged in order. An exchange
(2) is provided to switch the DTMF signal. A DTMF receiver apparatus
(4) converts the DTMF signal into a corresponding digital signal. A
key word storage apparatus (6) includes a service name file unit for
storing a service name file, and a key word dictionary unit for
storing a key word dictionary.

A text information storage apparatus (7) stores several
information data corresponding to each service name. A Central
Processor Unit converts a digital signal outputted from the DTMF
receiver apparatus into an input character string and matches the input
character string with the key words stored in the key word
storage apparatus in order to provide information data. A voice output
apparatus, connected to the Central Processor Unit, converts
a digital voice data signal into a voice signal to provide the desired
information service to the user.

ADVANTAGE - Facilitates inputting of a continuous character string 
without making a distinction between component words. 
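
A rough Python sketch of the two ideas in this abstract follows: decoding characters entered as two button presses, and matching a continuous input string against stored key words. The button-to-letter layout, the convention that the second press gives the letter's position on the key, and the greedy longest-match strategy are all assumptions made for illustration; the abstract does not specify them.

```python
# Hypothetical layout: letters printed on keys 2-9 of a touch-tone keypad.
KEY_LETTERS = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
               "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}


def decode_pairs(digits: str) -> str:
    """Decode a digit string in which every character is two presses:
    the key carrying the letter, then the letter's position on that key."""
    if len(digits) % 2:
        raise ValueError("expected an even number of key presses")
    chars = []
    for i in range(0, len(digits), 2):
        letters = KEY_LETTERS[digits[i]]
        pos = int(digits[i + 1])
        if not 1 <= pos <= len(letters):
            raise ValueError("position out of range for key " + digits[i])
        chars.append(letters[pos - 1])
    return "".join(chars)


def match_keywords(text: str, keywords: list) -> list:
    """Split a continuous string into known key words (greedy, longest first).
    Characters that start no key word are skipped."""
    by_length = sorted(keywords, key=len, reverse=True)
    found, i = [], 0
    while i < len(text):
        for kw in by_length:
            if text.startswith(kw, i):
                found.append(kw)
                i += len(kw)
                break
        else:
            i += 1
    return found
```

Under this assumed layout the presses 62 32 91 74 decode to NEWS, and the continuous string NEWSWEATHER splits into the key words NEWS and WEATHER without the caller marking a word boundary.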
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Abstract (Basic) : WO 9016036 A 

The document information retrieval method of effecting full text
search uses an apparatus with a magnetic disc device. A two-step presearch
of documents is effected with respect to a keyword for the
retrieval. In the first step of the presearch, a character table
describing, document by document, the presence or absence of all the
character codes included in the text data of the stored documents is
generated in advance. The character table is searched using all
character codes that constitute the keyword, and only the documents
including those character codes are picked up.

In the second step, compressed text data excluding annexed words
contained in the text data and repetitively appearing words are
generated, and documents containing the keyword as a word are picked
up out of the documents picked up in the first step. After the second
step (step 403), a text search (step 404) is effected according to
proximity condition, context condition, etc.
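
The first presearch step can be illustrated with a small Python sketch. It assumes an in-memory dictionary of documents and uses a set of characters per document as the character table; the function names are hypothetical, and the second step (compressed text without annexed words) is replaced here by a plain substring check, so this is not the method of the patent itself.

```python
def build_char_table(docs: dict) -> dict:
    """Character table: for each document, the set of characters it contains."""
    return {doc_id: set(text) for doc_id, text in docs.items()}


def presearch_step1(table: dict, keyword: str) -> list:
    """Keep only documents that contain every character of the keyword.
    This can return false positives but never misses a true match."""
    needed = set(keyword)
    return [doc_id for doc_id, chars in table.items() if needed <= chars]


def full_text_search(docs: dict, table: dict, keyword: str) -> list:
    """Run the cheap presearch, then verify the surviving candidates."""
    return [d for d in presearch_step1(table, keyword) if keyword in docs[d]]
```

The point of the table is that the set test is much cheaper than scanning text, so the expensive search only runs on documents that could possibly match.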
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Computer process for ranking word similarities - measuring number of 
basic operations needed to convert input to dictionary key word , and 
length of identical segments to develop score 
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Abstract (Basic) : EP 271664 A 

Morphological mapping generates keys which have similarities that
can be detected during a subsequent ranking procedure. The mapping is
defined so that unique consonants of the input word are listed in their
original order followed by the unique vowels for the input words,
also in their original order. The consonants in the keys are arranged
in alphabetical order followed by arranging the vowels in the keys in
alphabetical order.

A ranking technique is applied which makes use of a compound
measure of similarity for ranking the key words. A scoring system is
developed by first measuring the number of basic operations needed to
convert an input-derived key word into a dictionary-derived key word
(the higher the number, the less similar the words are) and then
measuring the length of identical character segments in each pair
of key words being matched. The score thus ranks the similarity of an
input word to dictionary-derived key words.

ADVANTAGE - Mapping is insensitive to consonant/consonant
transpositions as well as consonant/vowel transpositions and doubled
letters.
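
The key mapping and the compound similarity measure can be sketched in Python as follows. Several choices here are assumptions rather than details from the abstract: the key keeps consonants and vowels in their original order (the abstract also mentions an alphabetical arrangement), the basic operations are taken to be single-character insertions, deletions and substitutions, and the two measures are combined by sorting on operation count first and segment length second.

```python
VOWELS = set("aeiou")


def morph_key(word: str) -> str:
    """Unique consonants in original order, then unique vowels in original order."""
    consonants, vowels = [], []
    for ch in word.lower():
        if not ch.isalpha():
            continue
        bucket = vowels if ch in VOWELS else consonants
        if ch not in bucket:
            bucket.append(ch)
    return "".join(consonants) + "".join(vowels)


def edit_distance(a: str, b: str) -> int:
    """Number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def longest_common_segment(a: str, b: str) -> int:
    """Length of the longest run of identical consecutive characters."""
    best, prev = 0, [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, cur[-1])
        prev = cur
    return best


def rank(word: str, dictionary: list) -> list:
    """Order dictionary words from most to least similar to the input word."""
    key = morph_key(word)
    scored = []
    for entry in dictionary:
        entry_key = morph_key(entry)
        scored.append((edit_distance(key, entry_key),
                       -longest_common_segment(key, entry_key), entry))
    return [entry for _, _, entry in sorted(scored)]
```

With this key, the misspelling "recieve" and the word "receive" both map to "rcvei", and "leter" and "letter" both map to "ltre", which illustrates the insensitivity to transpositions and doubled letters claimed above.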
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Abstract: A regional information guidance system has been developed on an 
image workstation. Two main features of this system are a hypermedia
data structure and a friendly visual interface realized by the full-color
frame memory system. As the hypermedia data structure manages regional
information such as maps, pictures and explanations of points of interest,
users can retrieve this information one item at a time, moving from one to
the next as their interest changes. For example, users can retrieve the
explanation of a picture through the link between pictures and text
explanations. Users can also traverse from one document to another by using
keywords as cross reference indices. The second feature is to utilize a
full-color, high resolution and wide space frame memory for visual interface
design. This frame memory system enables real-time operation of image data
and natural scene representation. The system also provides a half tone
representing function which enables fade-in/out presentations. These
fade-in/out functions, used in displaying and erasing menu and image data,
make the visual interface soft for human eyes. The system we have developed
is a typical example of multimedia applications. We expect the image
workstation will play an important role as a platform for multimedia
applications. (Author abstract) 12 Refs.
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Abstract: An information system for quality control and related fields
consists of two parts: procedures for indexing and retrieval of
information from the published literature, and data and reports generated
within an organization. SWIFT LASS (Signal Word Index of Field and Title,
Literature Abstract Specialized Search), described in Part 1 of this paper,
is an example of the first kind of procedure. SWIFT SIR (Signal Word
Index of Field and Title, Scientific Information Retrieval), described in
Part 2, is an example of the second. SWIFT is an adaptation of the KWIC
(Key Word in Context) information retrieval system.
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In this dissertation, we present the results of our work that seeks to 
negotiate the gap between low-level features and high-level concepts in 
both content-based image retrieval and web document retrieval. This work 
concerns a technique, Latent Semantic Indexing (LSI), which has been used 
for textual information retrieval for many years. In this environment, LSI 
is used to determine clusters of co-occurring keywords, sometimes called
concepts, so that a query which uses a particular keyword can then
retrieve documents perhaps not containing this keyword, but containing
other keywords from the same cluster. In this dissertation, we first
examined the use of this technique for content-based image retrieval, using
various visual features, namely, global color histogram, subimage color
histogram, and color anglogram to represent the image contents. LSI is used
to transform the image feature representation into a semantic space. The 
transformed representation of the images in this lower-dimensional space 
captures the underlying semantic structure of image contents better than 
the original feature representation by finding the correlation of low-level 
features and high-level concepts. We have also examined the use of the LSI 
technique for web document retrieval in a similar process, using both 
keywords and image features to represent the documents. Two different
approaches to image feature representation, namely, color histogram and
color anglogram, are adopted and evaluated. Experimental results show that
LSI, together with both textual and visual features, is able to extract the 
underlying semantic structure of web documents, thus helping to improve the 
retrieval performance significantly. Based on these research works we 
firmly believe that negotiating the semantic gap between low-level features 
and high-level concepts using latent semantic indexing is a promising and 
feasible approach to improving content-based retrieval, and thus, 
developing more effective and more intelligent multimedia content 
management systems. 
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Content-based retrieval from image databases has become a popular 
research area where conventional database retrieval methods are not
sufficient because they depend on exact matches of keywords and require
an enormous amount of human involvement during manual annotation. Initial
work on content-based retrieval focused on using low-level features like
color and texture for image representation, and a geometric framework of 
distances in the feature space for similarity. A challenging problem in 
image retrieval is the fusion of information from multiple features and 
similarity measures. In this dissertation, we pose the retrieval problem in 
a probabilistic framework where the goal is to minimize the classification 
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General requirements are formulated for techniques of keywords selection 
in coordinate indexing. These requirements serve as the methodological
groundwork for designing and improving the keyword selection techniques.
An algorithm for evaluating the quality of the document search
pattern is proposed; first, by a word-by-word matching of a complete
document text against a relevant subject request, a standard document
search pattern is compiled, and, next, the patterns to be assessed are
compared with the standard. The technique that yields a search pattern
having the maximum number of keywords in common with the standard is
considered the optimum keyword selector.
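
The evaluation procedure described in this abstract can be sketched in Python as below. The sketch assumes whitespace tokenisation and case folding, and treats each selection technique simply as a named list of keywords it produced; the function names are hypothetical and the abstract gives no such details.

```python
def standard_pattern(document_text: str, subject_request: str) -> set:
    """Word-by-word matching of the full text against the subject request:
    the standard pattern is the set of request words found in the document."""
    doc_words = set(document_text.lower().split())
    return {w for w in subject_request.lower().split() if w in doc_words}


def best_technique(standard: set, candidates: dict) -> str:
    """Return the technique whose keyword list shares most words with the standard.
    `candidates` maps a technique name to the keywords it selected."""
    return max(candidates, key=lambda name: len(standard & set(candidates[name])))
```

A technique that selects three of the four standard keywords would, under this measure, be preferred over one that selects only one of them.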
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Publication Date: 1968 

ISSN: 0040-6872 

Language: Swedish 

Document Type: Journal Article 

Record Type: Abstract 

Journal Announcement: 0500 

In 1966, the first stage in the development of this information
retrieval system for bibliographic information storage and retrieval was
completed. Later this system was converted into a universal system easily
adaptable to any external information sources. Data are transferred to a
magnetic tape as required by the second part of the abacus, the search and
advice phase. The user expresses the subject of his request by using
keywords and by indicating the article containing references to sources
which might be of interest to him. The indications are used to form an
information request which is then assigned a code number for use in further
search. The search can be carried out by author's name, by journal title,
and so on. By using punched cards or punched tape, a request is transferred
onto magnetic tape. Information is fed to the i.r. system in two parts:
the fixed field descriptive one, and the variable field search one;
bibliographic data are recorded in the variable field part. Searching is
conducted by words, word combinations, or phrases in any codable language.
The i.r. system has facilities for translation of information from any
source into the code used in the i.r. system. Certain limitations of the
abacus system are indicated. The main phases in the development of the
second part of abacus are enumerated.

Classification Codes and Description: 5.00 (General Aspects) 
Main Heading: Information Processing and Control 
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DIALOG(R) File 202:Info. Sci. & Tech. Abs . 
(c) 2004 EBSCO Publishing. All rts . reserv. 

0100179

Technical information retrieval and tappi publications.

Author(s): Kouris, Michael

Corporate Source: Technical Association Of The Pulp And Paper Industry

Tappi vol. 49, no. 5, pages 142V-143a

Publication Date: May 1966

ISSN: 0734-1415

Language: English

Document Type: Journal Article

Record Type: Abstract

Journal Announcement: 0100

Reviews the documentation-oriented activities of tappi, including (1) the
annual tappi bibliography of papermaking & u.s. Patents; (2) the monthly
tappi magazine with two new features, namely, indexing at the source
by listing of thesaurus-based keywords for each original research
article, and listing of manuscripts received, which are available as
photocopies prior to publication; and (3) the establishment of an ad hoc
committee on information retrieval which sponsored a symposium on technical
information management and retrieval at the february 1966 annual meeting of
tappi in new york.

Classification Codes and Description: 1.01 ( Primary and Secondary 
Sources ) 

Main Heading: Information Science and Documentation 
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04284110 INSIDE CONFERENCE ITEM ID: CN0^4908712 

A new family of Commentz-Walter-style multiple-keyword pattern
matching algorithm

Watson, B. W.

CONFERENCE: Prague Stringology Club/PSCW '2000-Workshop

COLLABORATIVE REPORT-CZECH TECHNICAL UNIVERSITY DEPARTMENT OF COMPUTER
SCIENCE AND ENGINEERING DC, 2000 P: 71-76

Prague, Czech Tech. Univ, 2000

LANGUAGE: English DOCUMENT TYPE: Conference Papers
CONFERENCE EDITOR(S): Balik, M.; Simanek, M.
CONFERENCE SPONSOR: Prague Stringology Club
CONFERENCE LOCATION: Prague 2000; Sep (200009) (200009)

BRITISH LIBRARY ITEM LOCATION: 3298.375000 
DESCRIPTORS: PSCW; stringology 
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7244963 INSPEC Abstract Number: C2002-05-7440-104 



Class Codes: C7250R (Information retrieval techniques); C7250N (Search
engines); C6120 (File organisation)
Copyright 1999, IEE
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DIALOG (R) File 2:INSPEC
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6172883 INSPEC Abstract Number: C1999-04-7250R-002

Title: Method of selecting optimal characteristic values for browsing
search

Author(s): Kakimoto, T.; Kambayashi, Y.

Author Affiliation: Fujitsu Labs. Ltd., Toyota, Japan

Journal: Transactions of the Institute of Electronics, Information and
Communication Engineers D-I vol.J82D-I, no.1 p. 130-9
Publisher: Inst. Electron. Inf. & Commun. Eng,
Publication Date: Jan. 1999 Country of Publication: Japan
CODEN: DTRDES ISSN: 0915-1915
SICI: 0915-1915(199901)J82DI:1L.130:MSOC;1-1
Material Identity Number: M972-19Si9-002
Language: Japanese Document Type: Journal Paper (JP)
Treatment: Practical (P)

Abstract: In the case of retrieving data objects from digital libraries
or databases on the WWW, it is difficult to search for the required object
using only the keyword search. In the keyword search, AND search uses
characteristic values like keywords. In order to select these
characteristic values, it is necessary to check the retrieved data. To this
end the browsing process is very useful. We propose a method of selecting
the characteristic values optimized for the browsing search and a method
for the evaluation. We show the usefulness of this method by applying it to
three dissimilar data sets. (8 Refs)
Subfile: C 

Descriptors: digital libraries; information resources; information 
retrieval; Internet 

Identifiers: optimal characteristic value selection; browsing search; 
digital libraries; databases; World Wide Web; keyword search; data sets; 
information retrieval; Internet 

Class Codes: C7250R (Information retrieval techniques); C7210N
(Information networks)

Copyright 1999, IEE 
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DIALOG (R) File 2: INSPEC 

(c) 2004 Institution of Electrical Engineers. All rts. reserv. 

5783611 INSPEC Abstract Number: C9802-7250R-002 

Title: An efficient algorithm for full text retrieval for multiple 
keywords 

Author(s): Aoe, J. -I. 

Author Affiliation: Dept. of Inf. Sci. & Intelligent Syst., Tokushima 
Univ. , Japan 

Journal: Information Sciences vol.104, no. 3-4 p. 345-63 
Publisher: Elsevier, 

Publication Date: Feb. 1998 Country of Publication: USA 

CODEN: ISIJBC ISSN: 0020-0255 

SICI : 0020-0255 (199802) 104 : 3/4L . 345 : EAFT; 1-6 

Material Identity Number: 1132-97014 

U.S. Copyright Clearance Center Code: 0020-0255/ 98/$19 . 00 
Document Number: S0020-0255 ( 97 ) 00064-9 

Language: English Document Type: Journal Paper (JP) 
Treatment: Theoretical (T) 

Abstract: Text retrieval methods have attracted much interest recently.
There are numerous applications involving storage and retrieval of textual
data: electronic office filing, computerized libraries, automated law, and
so on. A well-known and simple approach of searching texts is full text
retrieval using signature files, but the method cannot be applied to
multiple keywords. This paper presents a fast retrieval algorithm for
multiple keywords by using the characteristics of multiple signatures. The
objective of this approach is to decrease the number of comparisons between
multiple signatures. From the simulation result for OR and AND-OR
operations and for less than 40 keywords, it is shown that the presented
algorithm is from two to six times faster than the traditional algorithm.
(16 Refs)
Subfile: C 

Descriptors: full-text databases; information retrieval; string matching

Identifiers: full text retrieval; multiple keywords; electronic office 
filing; computerized libraries; automated law 

Class Codes: C7250R (Information retrieval techniques); C4240 ( 
Programming and algorithm theory); C6130 (Data handling techniques) 

Copyright 1997, IEE
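
For readers unfamiliar with signature files, the following Python sketch shows the baseline idea the abstract refers to, using superimposed coding: each word sets a few bits, a document signature is the OR of its word signatures, and a query keyword can only occur in documents whose signature covers the keyword's bits. The signature width, the number of bits per word, the SHA-256 based hashing and the function names are assumptions made for this sketch. It is the plain approach with a simple OR over keywords, not the accelerated multiple-signature algorithm of the paper, which the abstract does not describe in detail.

```python
import hashlib

SIG_BITS = 64        # assumed signature width
BITS_PER_WORD = 3    # assumed number of bits set per word


def word_signature(word: str) -> int:
    """Set a few hash-selected bits for one word."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    sig = 0
    for i in range(BITS_PER_WORD):
        sig |= 1 << (digest[i] % SIG_BITS)
    return sig


def build_signature_file(docs: dict) -> dict:
    """Document signature = bitwise OR of the signatures of its words."""
    index = {}
    for doc_id, text in docs.items():
        sig = 0
        for word in text.split():
            sig |= word_signature(word)
        index[doc_id] = sig
    return index


def search_or(docs: dict, index: dict, keywords: list) -> list:
    """Return documents containing at least one keyword (OR query)."""
    kw_sigs = [(kw.lower(), word_signature(kw)) for kw in keywords]
    hits = []
    for doc_id, doc_sig in index.items():
        for kw, kw_sig in kw_sigs:
            # Signature test first; the text check removes false drops.
            if (doc_sig & kw_sig) == kw_sig and kw in docs[doc_id].lower().split():
                hits.append(doc_id)
                break
    return hits
```

In this naive form every keyword signature is compared with every document signature, which is the comparison cost the paper sets out to reduce.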
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DIALOG ( R) File 2 : INSPEC 

(c) 2004 Institution of Electrical Engineers. All rts . reserv. 

5464918 INSPEC Abstract Number: C9702-6130-024 
Title: The performance of single and multiple keyword pattern matching
algorithms

Author(s): Watson, B.W.

Author Affiliation: Ribbit Software Syst. Inc., Canada

Conference Title: Proceedings of the Third South American Workshop on
String Processing. WSP 1996 p.280-94

Editor(s): Ziviani, N.; Baeza-Yates, R.; Guimaraes, K.
Publisher: Carleton University Press, Ottawa, Ont., Canada
Publication Date: 1996 Country of Publication: Canada 294 pp.
ISBN: 0 88629 308 1 Material Identity Number: XX96-02514

Conference Title: Proceedings of Third South American Workshop on String
Processing. WSP '96

Conference Date: 8-9 Aug. 1996 Conference Location: Recife, Brazil
Language: English Document Type: Conference Paper (PA)
Treatment: Practical (P)

Abstract: This paper presents performance data on some pattern matching
algorithms, and recommendations for the selection of an algorithm (given a
particular application). The pattern matching problem, and algorithms
solving it, are considered extensively in Watson (1995). The performance of
all of the algorithms (running on a variety of workstation hardware) was
measured on two types of input: English text and genetic sequences. The
input data, which is the same as that used in the benchmarks of Hume and
Sunday (1991), were chosen to be representative of two of the typical uses
of pattern matching algorithms. The differences between natural language
text and genetic sequences serve to highlight the strengths and weaknesses
of each of the algorithms. Until now, the performance of the
multiple-keyword algorithms (Aho-Corasick (1975) and Commentz-Walter
(1979)) had not been extensively measured. The Knuth-Morris-Pratt (1977)
and Aho-Corasick algorithms performed linearly and consistently (on widely
varying keyword sets) as their theoretical running time predicts. The
Commentz-Walter algorithm (and its variants) displayed more interesting
behaviour, greatly outperforming even the best Aho-Corasick variant on a
large portion of the input data. The recommendations section of this paper
details the conditions under which a particular algorithm should be chosen.
(17 Refs)

Subfile: C 

Descriptors: pattern matching; search problems; software performance 
evaluation; string matching 

Identifiers: keyword pattern matching algorithms ; performance data; 
genetic sequences; natural language text; Aho-Corasick algorithm; 
Commentz-Walter algorithm; Knuth-Morris-Pratt algorithm 

Class Codes: C6130 (Data handling techniques) 

Copyright 1997, IEE 
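
As background to the multiple-keyword algorithms named in this record, here is a compact Python sketch of the Aho-Corasick approach: a trie of the keywords with failure links, scanned once over the text. It is a textbook-style construction written for illustration and makes no claim to match the variants or implementations benchmarked in the paper; the function names are hypothetical.

```python
from collections import deque


def build_automaton(keywords):
    """Build goto, failure and output tables for a set of keywords."""
    goto, fail, out = [{}], [0], [[]]
    for kw in keywords:
        state = 0
        for ch in kw:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(kw)
    # Breadth-first pass: a node's failure link points to the longest proper
    # suffix of its path that is also a path from the root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(text, keywords):
    """Return (start_index, keyword) for every occurrence, in one pass."""
    goto, fail, out = build_automaton(keywords)
    state, matches = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for kw in out[state]:
            matches.append((i - len(kw) + 1, kw))
    return matches
```

Each text character is examined once apart from failure-link steps, which is consistent with the linear behaviour the abstract reports for Aho-Corasick across keyword sets.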
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02255338 JICST ACCESSION NUMBER: 95A0031339 FILE SEGMENT: JICST-E 
The Fast Algorithm of Full Text Retrieval for Multiple Keywords. 

ARITA TAKESHI (1); TSUDA KAZUHIKO (1); IRIGUCHI HIROKAZU (1); AOE JUN'ICHI
(1)

(1) Univ. of Tokushima, Fac. of Eng.

Joho Shori Gakkai Kenkyu Hokoku, 1994, VOL.94, NO.98(NL-104), PAGE.47-54,
FIG.8, REF.21
JOURNAL NUMBER: Z0031BAO ISSN NO: 0919-6072 
UNIVERSAL DECIMAL CLASSIFICATION: 681.3:80 002.5:005 
LANGUAGE: Japanese COUNTRY OF PUBLICATION: Japan 

DOCUMENT TYPE: Journal 
ARTICLE TYPE: Original paper 
MEDIA TYPE: Printed Publication 

ABSTRACT: Text retrieval methods have attracted much interest recently.
There are numerous applications involving storage and retrieval of
textual data: electronic office filing, computerized libraries,
automated law and so on. A well-known and simple approach of searching
texts is full text retrieval using signature files, but the method
cannot be applied to a finite number of keywords. This paper presents a
fast retrieval algorithm for multiple keywords by using characteristics
of multiple signatures. The algorithm decreases the number of
comparisons between multiple signatures. From the simulation result, it
is shown that the algorithm presented is from 10 to 17 times faster than
the traditional approach for from 16 to 32 multiple keywords. (author
abst.)

DESCRIPTORS: automatic language processing; information retrieval; keyword; 
vector (mathematics) ; speedup; pattern matching 

BROADER DESCRIPTORS: computer application; utilization; information 

processing; treatment; retrieval; vocabulary; linear algebra; algebraic 
system; modification; improvement; matching (graph) ; matching 

CLASSIFICATION CODE ( S ) : JE06000L; AC06020S 
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01796943 JICST ACCESSION NUMBER: 93A0706163 FILE SEGMENT: JICST-E 
On Natural Language Medical Data Management System. 

NISHIMURA YASUSHI (1); ODA SEIO (1); SHIRAISHI MASATO (2); YOKOTA MASAO (3) 
(1) Fukuokakogyotankidaigaku; (2) Fukuoka Univ. of Education; (3) Fukuoka
Inst. of Technology

Denshi Joho Tsushin Gakkai Gijutsu Kenkyu Hokoku (IEIC Technical Report
(Institute of Electronics, Information and Communication Engineers),
1993, VOL.93, NO.132(NLC93 31-41), PAGE.1-8, FIG.5, TBL.2, REF.7

JOURNAL NUMBER: S0532BBG 

UNIVERSAL DECIMAL CLASSIFICATION: 681.3.02:61 681.3:061.68 
LANGUAGE: Japanese COUNTRY OF PUBLICATION: Japan 

DOCUMENT TYPE: Journal 
ARTICLE TYPE: Original paper 
MEDIA TYPE: Printed Publication 

ABSTRACT: The authors have been accumulating medical information such as
discharge summaries in order to realize a computer system
which understands natural-language queries to databases. Analyzing
the question sentences from doctors, it turned out that they expect
secondary information obtained by statistical processing of the
medical treatment data. We therefore developed a prototype
of a natural language medical data management system which has these
statistical processing functions, and we report an outline of it. By
making good use of the grammatical characteristics seen in
discharge summaries, the system extracts the attribute values of
keywords beforehand and builds the retrieval dictionary. As a result,
the system achieves high-speed processing. (author abst.)



DESCRIPTORS: medical information processing system; query answering system; 
data management; automatic language processing; database; keyword; 
patient 

BROADER DESCRIPTORS: information system; computer application system; 

system; management; computer application; utilization; information 

processing; treatment; vocabulary; human ( sociology) 
CLASSIFICATION CODE(S): JE15030Q; JD03030U 
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00486555 JICST ACCESSION NUMBER: 87A0479167 FILE SEGMENT: JICST-E 
On two inference methods for the document retrieval and their application 
to a OA system. 

MORINAGA HIROSHI (1); KOBAYASHI KIYOHIRO (1); KANAOKA TAIHO (1); TOMITA 
SHINGO (1); TAKAHASHI MASASHI (2); KAWAKAMI NOBORU (2); IMAMURA 
TSUGUYASU (2); TANOUE FUMIROU (2) 
(1) Yamaguchidai Ko; (2) Chugokunihondenkisofutoea

Denshi Joho Tsushin Gakkai Gijutsu Kenkyu Hokoku(IEIC Technical Report
(Institute of Electronics, Information and Communication Engineers),
1987, VOL.87, NO.101, PAGE.37-42(OS87-13), FIG.10, TBL.1, REF.4

JOURNAL NUMBER: S0532BBG 

UNIVERSAL DECIMAL CLASSIFICATION: 002.5 681.3.02:651.2 
LANGUAGE: Japanese COUNTRY OF PUBLICATION: Japan 

DOCUMENT TYPE: Journal 
ARTICLE TYPE: Original paper 
MEDIA TYPE: Printed Publication 

ABSTRACT: This paper proposes a new type of document retrieval system which
includes an inference module. In current retrieval systems,
the user must first input keywords or attributes concerning
the desired documents. In doing so, the user must take care to input
characters exactly and must understand the structure of the attribute
values. Moreover, when the desired document cannot be found, those
systems force the user to input further characters. To address these
problems, we try to develop a new retrieval system including two
inferencing techniques called character inference and common item
inference. With this system, the user is encouraged
to keep operating the system and can search for what is required
effectively. (author abst.)

DESCRIPTORS: data retrieval; data retrieval system; artificial intelligent
inference; attribute; module; OA(office)

BROADER DESCRIPTORS: fact retrieval; information retrieval; retrieval; 

information retrieval system; information system; computer application 
system; system; inference; property; mechanization; automation; 
modification 

CLASSIFICATION CODE(S): AC06010H; JE12000O 
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39/3, K/5 (Item 5 from file: 647) 

DIALOG (R) File 647: CMP Computer Fulltext 
(c) 2004 CMP Media, LLC. All rts . reserv. 

01054970 CMP ACCESSION NUMBER: WIN19950701S0036 

In brief - Internet Access (New Products - Remote Access) 

WINDOWS MAGAZINE, 1995, n 60, PG69 

PUBLICATION DATE: 950701 

JOURNAL CODE: WIN LANGUAGE: English 

RECORD TYPE: Fulltext 
SECTION HEADING: departments 
WORD COUNT: 116 

TEXT: 

Sports- that you can access by pointing and clicking on the 
appropriate icon. You can search by keywords and attribute values . 
Or add other addresses you've found and organize the whole database by 
country, institution type, service and main subject. You must have 
direct access to the Internet, rather than via an online service, to use 



39/3, K/7 (Item 7 from file: 647) 

DIALOG (R) File 647: CMP Computer Fulltext 
(c) 2004 CMP Media, LLC. All rts. reserv. 

01022114 CMP ACCESSION NUMBER: WIN19940601S1845 
In Search of the Perfect PIM
James E. Powell

WINDOWS MAGAZINE, 1994, n 506, 264
PUBLICATION DATE: 940601
JOURNAL CODE: WIN LANGUAGE: English

RECORD TYPE: Fulltext

SECTION HEADING: Reviews

TEXT:

packages are intended to balance the features of schedulers and
address books and, in some cases, add features to link these two
resources together. It isn't always a clear line, of course. Delrina's
Daily Planner and Individual ... strona for example, you can choose to print
only those entries beginning with M. The program's primary emphasis
isn't meant to be on traditional business features, such as time or
contact management: At...

...this case, it's the appointment scheduler that fails to live up to
competing programs. Oddities abound. First of all, it doesn't use time
slots; instead, you click on a time listed in a...

...Here's a program that lets you enter, find and display information in
every way imaginable. The primary screen is a spreadsheet (table)
version of the address book that lists data in user-selectable columns...
hour seminar. There is no search option, either, so we couldn't find which
appointment contained the keyword widgets. Daily PlanIt offers several
appointment display formats, from one- and two-day to week- and two...

...user. The opening screen displays a calendar, appointment list, to-do
list and telephone book on the main screen. Drag the name from your
phone book to the appointment area, and the new appointment dialog...

39/3,K/24 (Item 16 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM)
(c) 2004 The Gale Group. All rts. reserv.

02032708 SUPPLIER NUMBER: 19030804 (USE FORMAT 7 OR 9 FOR FULL TEXT)

Textbases deliver Web results. (Review of products offering text storage,
indexing and retrieving capabilities) (special supplement: Internet
Systems) (Software Review) (Evaluation)

Spitzer, Tom

DBMS, v10, n1, pS13(5)

Jan, 1997

DOCUMENT TYPE: Evaluation ISSN: 1041-5173 LANGUAGE: English

RECORD TYPE: Fulltext; Abstract
WORD COUNT: 4678 LINE COUNT: XoOBtn

use this feature, for example, to set up an employee skills
inventory; for each employee, the skills field could have multiple
values.

In addition to keyword searches, the DB/TextWorks engine searches
for keyword pairs referred to as terms, word stems, phrases, and...

DIALOG (R) File 275: Gale Group Computer DB(TM)
(c) 2004 The Gale Group. All rts. reserv.



02000988 SUPPLIER NUMBER: 18849088 

Desperately seeking surfers; Web programmers try to alter search engines'
results. (Web site developers attempt to alter Web search results)
(Internet/Web/Online Service Information)

Flynn, Laurie J.

New York Times, v146, Mon ed, col 4, pC5(N) pD5(L)
Nov 11, 1996 

ISSN: 0362-4331 LANGUAGE: English RECORD TYPE: Abstract 

...ABSTRACT: alter search statistics in order to place their sites at the
top of lists displayed when a user's search is completed. Consultants
offer suggestions on how to accomplish this, while the developers of search

...stay one step ahead of them. The most common approach is to load a site
with specific keywords, a technique referred to as 'keyword
stuffing'. The keywords are hidden, sometimes behind graphics or by
displaying them in black against a black background. The search engine will
count the keywords and display the site higher in the relevancy ranking.
Repeating the words several times increases the count. Some Web site
administrators have placed complete dictionaries on the first page of
their site. The approach diminishes the credibility of the search process
and irritates users. Most search engines now employ filters that recognize
repetition and other keyword stuffing techniques.
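
The count-based ranking and the repetition filter described in this abstract can be illustrated with a minimal Python sketch. It is not the method of any particular search engine: the function names, the 20% threshold and the minimum page length are assumptions made for illustration only.

```python
import re
from collections import Counter

def keyword_count_score(page_text, query_keywords):
    """Naive relevancy: total occurrences of the query keywords on the page."""
    counts = Counter(re.findall(r"[a-z0-9]+", page_text.lower()))
    return sum(counts[k.lower()] for k in query_keywords)

def looks_stuffed(page_text, max_share=0.2, min_words=20):
    """Hypothetical filter: flag a page when one word dominates its text.

    max_share and min_words are illustrative values, not published thresholds.
    """
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    if len(words) < min_words:
        return False  # too short to judge
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) > max_share

honest = ("Our shop sells garden tools, seeds and pots, and we ship every "
          "order within two days to customers across the country.")
stuffed = honest + " garden" * 30  # hidden repeated keyword

print(keyword_count_score(honest, ["garden"]), looks_stuffed(honest))
print(keyword_count_score(stuffed, ["garden"]), looks_stuffed(stuffed))
```

A pure count rewards the stuffed page, which is why a repetition check is applied before ranking. This simple share test would not catch the "complete dictionary" variant mentioned above, since no single word dominates there.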



39/3, K/27 (Item 19 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2004 The Gale Group. All rts. reserv. 

01951874 SUPPLIER NUMBER: 18414884 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

World wide FileMaker. (linking Web sites to FileMaker Pro database)
(Internet/Web/Online Service Information) 

Brisbin, Shelly 

MacUser, v12, n8, p107(3)

August, 1996 

ISSN: 0884-0997 LANGUAGE: English RECORD TYPE: Fulltext; Abstract 

WORD COUNT: 1390 LINE COUNT: 00110 

... the search on the search-results page (see the "Instant Web Pages"
sidebar).

The simplest database requires two new calculation fields. The
first delivers a list of all records found to match a keyword search.
The second calculation generates the HTML to display the contents of an
individual record (for the specific item the user selects in the list
generated by the first search). You need some knowledge of FileMaker
calculations to create these fields, but CGI-application writers usually...
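
The two-calculation arrangement in this excerpt can be sketched outside FileMaker. The following Python analogue is only an illustration: the record layout, field names and link format are hypothetical, and it does not use FileMaker's calculation syntax.

```python
from html import escape

# Hypothetical sample data standing in for database records.
RECORDS = [
    {"id": 1, "title": "Red widget", "notes": "Small metal widget"},
    {"id": 2, "title": "Blue gadget", "notes": "Plastic gadget"},
]

def matching_list(records, keyword):
    """First calculation: a list of every record matching the keyword."""
    kw = keyword.lower()
    hits = [r for r in records
            if kw in r["title"].lower() or kw in r["notes"].lower()]
    items = "".join(
        f'<li><a href="?id={r["id"]}">{escape(r["title"])}</a></li>'
        for r in hits
    )
    return f"<ul>{items}</ul>"

def record_page(record):
    """Second calculation: HTML showing the contents of one selected record."""
    return (f"<h1>{escape(record['title'])}</h1>"
            f"<p>{escape(record['notes'])}</p>")

print(matching_list(RECORDS, "widget"))
print(record_page(RECORDS[1]))
```

Escaping field contents before placing them in markup keeps characters such as `&` and `<` in the data from breaking the generated page.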



39/3,K/30 (Item 22 from file: 275)

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2004 The Gale Group. All rts. reserv.

01862628 SUPPLIER NUMBER: 17581438 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

Future database technologies now. (visual query systems) (Technology 
Information) 

Frank, Maurice 

DBMS, v8, n12, p52(5)

Nov, 1995 

ISSN: 1041-5173 LANGUAGE: English RECORD TYPE: Fulltext; Abstract 

... arm's-length process. Historically, users trying to find multimedia
objects have had to rely on associated keywords, primary-key values
or other alphanumeric fields on the same record.

Unfortunately, these indirect surrogates often make poor handles
because different people describe images...

Underware for groupware. (Lotus Development Corp.'s Lotus Notes
application)

Dyson, Esther 

RELease 1.0, v92, n8, p7(6)

retrieval system. Its data-structuring facilities make the
information more intelligible to humans. They can query by keywords or
values in fields such as date, topic, author, skills in a resume or the
name of an applicant. Notes also...

Weisfeld, Matt; Gilson, Michael J.

... the parsing is done is not a major concern for this discussion. It
is part of the user interface, not the database system. However, the
information that the line contains is crucial. The search is interested in
two pieces of information: the keyword price and the value 33.3. First,
you must determine if this information is valid. If price is not a field
in the master structure, there is nothing with which to compare it. Second,
it must be predetermined that, at least in this case, price is a
floating-point number. Under no circumstances can price represent more than
one scalar data type in the same application. Third, keyword price in the
master structure must be located such that the search can operate on the
proper... the price (float) 0.0, and the code (string) 'zero'. Notice that
this transaction conforms to the rules for valid keywords and scalar
data type in the field...



...ptr++;

To illustrate further, assume that the search is for a keyword
price with the value of 22.2. The address of the first value of price
is already known by inspecting the initial address. See Figure 2. A
comparison reveals that tran-record[0] has a price of 0.0... record, thus
providing the number of members of the array.

The two character pointers make up the user-supplied key. For
example, if the search criterion was to find all...

...value and price would be the user keyword. A user query such as find
all price = 33.0 could be parsed to supply this information. Again, because
the search algorithm is the primary focus here, this information will be
generated internally, relieving us of the user interface,
type...
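
The three checks described in these excerpts (the keyword must be a field of the master structure, each field has exactly one scalar type, and the search then runs over that field in every record) can be sketched as follows. The article itself works in C with pointers; this Python version is only a hedged analogue, and the names MASTER_STRUCTURE, parse_query, find_all and the sample records are assumptions.

```python
# Hypothetical master structure: each keyword maps to exactly one scalar type.
MASTER_STRUCTURE = {"price": float, "code": str}

def parse_query(keyword, raw_value):
    """Validate the keyword and convert the text value to the field's type."""
    if keyword not in MASTER_STRUCTURE:
        raise KeyError(f"unknown keyword: {keyword}")  # nothing to compare with
    return keyword, MASTER_STRUCTURE[keyword](raw_value)

def find_all(records, keyword, raw_value):
    """Return the indexes of all records whose field equals the typed value."""
    key, value = parse_query(keyword, raw_value)
    # Exact float comparison is used for simplicity; real code may need a tolerance.
    return [i for i, rec in enumerate(records) if rec[key] == value]

# Illustrative transaction records; the first mirrors the all-zero record above.
tran_records = [
    {"price": 0.0, "code": "zero"},
    {"price": 22.2, "code": "a"},
    {"price": 33.3, "code": "b"},
    {"price": 22.2, "code": "c"},
]

print(find_all(tran_records, "price", "22.2"))
print(find_all(tran_records, "code", "zero"))
```

Keeping the type in one table means a query such as find all price = 33.3 is converted once, before any record is inspected, and an unknown keyword is rejected up front.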



39/3, K/58 (Item 50 from file: 275) 

DIALOG (R) File 275: Gale Group Computer DB(TM) 
(c) 2004 The Gale Group. All rts. reserv.

01422836 SUPPLIER NUMBER: 09767979 (USE FORMAT 7 OR 9 FOR FULL TEXT) 

argv[3] = "LINEBUFSIZE=1024"
argv[4] = "OPENLINES=20"
argv[5] = "AUDITING=NO"

As you can see, the keywords are not in any specific order or case.
LAN Manager preprocesses the keywords to remove blanks and converts the
separating colon to an equal-sign (the user may specify either a colon or
an equal-sign when entering the keyword and its value). Any text
following the equal-sign or colon is treated as the keyword value,
including comments, and is passed to your service. If while overriding a
keyword in LANMAN.INI the user enters only a part of the keyword name,
LAN Manager still passes the full keyword from LANMAN.INI (if it
exists). Since explicitly specified keywords always come first, you may
want to ignore second and subsequent occurrences of a keyword; this
allows the user to override values in LANMAN.INI using short-cuts when
starting the service.
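
The "first occurrence wins" handling suggested above can be sketched in a few lines. This is not LAN Manager code: parse_keywords is a hypothetical helper, it accepts either separator itself rather than relying on preprocessing, and it does not model the expansion of partial keyword names.

```python
import re

def parse_keywords(args):
    """Collect keyword/value pairs, keeping only the first occurrence of each."""
    settings = {}
    for arg in args:
        # Split on the first '=' or ':' only; everything after it is the value.
        parts = re.split(r"[=:]", arg, maxsplit=1)
        if len(parts) != 2:
            continue  # not a keyword/value pair
        name = parts[0].replace(" ", "").upper()  # keywords arrive in any case
        # setdefault ignores second and subsequent occurrences of a keyword.
        settings.setdefault(name, parts[1].strip())
    return settings

# Explicit user overrides come first, followed by the LANMAN.INI defaults.
args = ["linebufsize:2048", "OPENLINES=20", "LINEBUFSIZE=1024", "Auditing = NO"]
print(parse_keywords(args))
```

Because the explicitly specified keywords precede the file defaults, the user's 2048 survives and the later 1024 is dropped.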

Making a Service Known. . . 

New PIM indexes files, free-form data. (Bananafish Software's
ThoughtPattern personal-information management system) (product
announcement)

Cohen, Raines 

... given topic shows all types of items on that topic, including
files.

The concept of indexing a user's information in a central location
is not new to the Macintosh. Mainstay's Marco Polo provides both
indexing and compression; ON Technology Inc.'s On Location automatically
indexes file names and contents and lets users view many kinds of files;
and Kiwi Software Inc.'s KiwiFinder Extender builds a variety of
file-classification and search techniques, including keywords, into
standard dialog boxes.

Bananafish Software is at 730 Central Ave., San Francisco, Calif. 
94117. Phone (415... 

The big index. (Some Assembly Required) (tutorial)

Wayner, Peter 

ABSTRACT: Techniques for creating a keyword index to manage files
using a combination of tree, trie and linked-list data structures are
presented. The scheme includes a list of the filenames on the disk, a list
of the keywords in the files and a set of pointers between the keywords
and the filenames. There is one entry for each filename and for each unique
word on the disk; pointers link the two lists. Each file has a unique ID
number, and each file tree node contains the name of a file or directory
and three pointers to other nodes. The sample index program is limited...

...13 bits of information. A 'trie' is an alphabetically sorted tree where
26 roots correspond to the first letters of the words; letters in the
nodes along paths from the roots to the leaves are...
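
The general shape of such an index (a filename list with unique IDs, a trie of keywords, and links from each keyword to the files containing it) can be sketched as below. This is a simplified illustration, not the article's program: it uses Python dictionaries and sets in place of the tree nodes, linked lists and pointers the abstract describes, and the class and method names are assumptions.

```python
import re

class TrieNode:
    def __init__(self):
        self.children = {}     # letter -> TrieNode
        self.file_ids = set()  # IDs of files containing the word ending here

class KeywordIndex:
    def __init__(self):
        self.root = TrieNode()  # its children play the role of the 26 roots
        self.filenames = []     # a file's unique ID is its position in this list

    def add_file(self, name, text):
        """Register a file and link each of its unique words to its ID."""
        file_id = len(self.filenames)
        self.filenames.append(name)
        for word in set(re.findall(r"[a-z]+", text.lower())):
            node = self.root
            for letter in word:
                node = node.children.setdefault(letter, TrieNode())
            node.file_ids.add(file_id)
        return file_id

    def lookup(self, word):
        """Return the names of all files that contain the exact word."""
        node = self.root
        for letter in word.lower():
            node = node.children.get(letter)
            if node is None:
                return []
        return sorted(self.filenames[i] for i in node.file_ids)

index = KeywordIndex()
index.add_file("a.txt", "Widgets and gadgets")
index.add_file("b.txt", "More widgets")
print(index.lookup("widgets"))
print(index.lookup("gadgets"))
```

Each unique word is stored once, however many files contain it, and a lookup walks one node per letter regardless of how many files are indexed.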

All the news that fits. (information retrieval services)

...ABSTRACT: simple keyword in context (kwic) technology. Individual Inc
uses software based on information theory to search text; key words are
given weighted values. Highly rated articles are selected for
Individual's First! newsletter. Yosi Amram will soon have a great deal of
company in the information service business; the...

language) (technical)

Communications of the ACM, v33, n8, p50(22)

DOCUMENT TYPE: technical ISSN: 0001-0782 LANGUAGE: ENGLISH

RECORD TYPE: FULLTEXT; ABSTRACT 

WORD COUNT: 14427 LINE COUNT: 01130 



... outstanding problems of artificial intelligence. Scan uses Lucy for
sentence-level understanding, supplementing, but not completely replacing,
standard key-word techniques. Lucy can perform two different
operations that are important to Scan. The first, which has already been
implemented, extends the idea of query expansion as described in [21] and
[56]. Lucy understands the user's query and maps it into a CycL
expression that can be transformed and expanded to include...

...which generates a new set of Boolean queries, which it passes off to
a standard retrieval engine. The second operation, which we will pursue
later, is to understand at least fragments of the stored texts themselves

... mutable expression is bound to the symbol BankAccount. The
generator allows creation of instances via a create expression.
Following the keyword mutable is a sequence of identifiers for the state
variables of an instance of BankAccount. In this...



